Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
Author
Abstract
When applied to probabilistic categorial grammar learning, the Minimum Description Length principle outperforms Maximum Likelihood Estimation. Smoothing does not bridge the gap between the two approaches.
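The contrast between the two criteria can be illustrated with a toy model-selection sketch (this is an assumption-laden illustration, not the paper's implementation: the candidate grammars, their probabilities, and the description-length costs below are invented for the example). MLE scores a candidate only by how well it fits the data, while MDL also charges for the bits needed to describe the candidate itself:

```python
import math

def neg_log_likelihood(probs, data):
    """Bits needed to encode the data under the model's probabilities."""
    return -sum(math.log2(probs[w]) for w in data)

def mdl_score(probs, data, grammar_bits):
    """MDL: bits to describe the grammar plus bits to encode the data."""
    return grammar_bits + neg_log_likelihood(probs, data)

data = ["a", "a", "b", "a"]

# A larger grammar that fits the sample tightly but is costly to describe,
# versus a smaller, cheaper grammar with a slightly worse fit.
# (Both grammars and their bit costs are hypothetical.)
big = {"a": 0.75, "b": 0.25}
small = {"a": 0.5, "b": 0.5}

mle = min([(neg_log_likelihood(big, data), "big"),
           (neg_log_likelihood(small, data), "small")])
mdl = min([(mdl_score(big, data, grammar_bits=6.0), "big"),
           (mdl_score(small, data, grammar_bits=1.0), "small")])

print(mle[1])  # MLE prefers the tighter-fitting grammar: big
print(mdl[1])  # MDL's description cost tips it to: small
```

The design point is that MDL's model-cost term penalises grammars that merely memorise the sample, which is the kind of overfitting that, per the abstract, smoothing alone does not repair for MLE.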
Similar resources
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
This paper presents a method for learning efficient parsers of natural language. The method consists of an Explanation-Based Learning (EBL) algorithm for learning partial-parsers, and a parsing algorithm which combines partial-parsers with existing "full-parsers". The learned partial-parsers, implementable as Cascades of Finite State Transducers (CFSTs), recognize and combine constituents efficient...
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
We present a modified version of the Transformation-Based Approach (TBA) and Transformation-Based Error-Driven Learning (TBEDL). We modified the TBA in order to work with a dependency tree structure, which describes more efficiently the syntax of inflective and free-word-order languages, such as the Czech language. The major changes and characteristics are described in more detail: they mostly conc...
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
The morphological lexicon is an important part of NLP systems and is typically handwritten with the help of expert linguists. Even a partial automation of this process could decrease the cost of the lexicon, and would also be of theoretical importance for languages and dialects which have not yet been well analysed. In this work we describe an attempt to use the minimum description length (MDL) as the on...
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and...
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
The evolution of language can only be explained when we take into account a language learning process which is necessarily imperfect, due to weak data and the limits of induction. Thus language change yields clues about what the learning processes are, and conversely, hypothesised learning processes should predict possible language changes. This paper considers this issue by studying lexicon formatio...